Calculation of diversification indicators and other covariates

Author

Romain Frelat

Published

November 26, 2025

Summary

Indicator                             Data              Format
perimeter and area of the field       RPG               vector
mean field size within buffer         RPG               vector
hedgerow length around field          RPG + BD Haies    vector
crop rotation (N-5:N)                 RPG + OSO         raster
% land cover within buffer            RPG + OSO         raster
density of field borders (bordures)   RPG + OSO         raster

Data description:

  • Registre Parcellaire Graphique (RPG) (45 GB): annual field-level crop data for the period 2007-2023, available for the whole of France on the IGN website: https://geoservices.ign.fr/rpg. The definition of fields (parcelles) is consistent only for the recent period 2015-2023.
  • Carte d’occupation des sols du CES OSO – THEIA (OSO) (6.6 GB): annual land cover data for the period 2016-2024, available for France in raster format at 10 m resolution: https://doi.org/10.57745/UZ2NJ7. Official access is through the CNES website: https://geodes-portal.cnes.fr.
  • BD Haies v2 (6.8 GB): hedgerow dataset for France, available on the IGN website: https://geoservices.ign.fr/bdhaie. BD Haies v2 was produced from satellite images from 2020-2022, which fits our data better than v1 (based on images from 2011-2014).

Field observations

Table 1: Number of observations per year and per project
2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 TOTAL
BACCHUS 0 0 0 0 40 38 40 40 38 38 38 272
BIOMHE 0 0 0 0 0 0 40 0 0 0 0 40
BISCO 0 0 0 27 0 0 0 0 0 0 0 27
DIVAG 0 0 0 0 0 40 0 0 0 0 0 40
DURUM_MIX_GM 0 0 0 1 1 0 0 0 0 0 0 2
FRAMEwork_BVD 0 0 0 0 0 0 0 36 0 0 0 36
LepiBats 0 0 0 0 0 0 0 50 0 0 0 50
MUESLI 0 0 60 0 0 0 0 0 0 0 0 60
OSCAR 0 0 0 0 15 33 38 67 88 100 107 448
SEBIOPAG_BVD 0 0 0 0 0 0 20 20 20 0 0 60
SEBIOPAG_Plaine de Dijon 20 20 20 20 20 20 20 20 20 20 20 220
SEBIOPAG_VcG 19 19 17 17 17 17 17 17 17 17 0 174
SEBIOPAG_ZAAr 20 20 20 0 20 0 0 20 0 20 0 120
SERIPAGE 0 0 9 0 0 0 0 0 0 0 0 9
TOTAL 59 59 126 65 113 148 175 270 183 195 165 1558

Because of data availability (RPG has not yet been released for 2024 and OSO is not available before 2016), we focus only on the period 2016-2023. A total of 1275 observations were made between 2016 and 2023.
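The count of observations in the study period can be checked directly from the TOTAL row of Table 1. The sketch below is only an illustration of that arithmetic; the variable and function names are hypothetical and not taken from the analysis scripts.

```python
# Yearly totals copied from the TOTAL row of Table 1.
totals_per_year = {
    2014: 59, 2015: 59, 2016: 126, 2017: 65, 2018: 113, 2019: 148,
    2020: 175, 2021: 270, 2022: 183, 2023: 195, 2024: 165,
}

# Study period constrained by data availability:
# OSO starts in 2016 and RPG currently ends in 2023.
FIRST_YEAR, LAST_YEAR = 2016, 2023

def count_in_period(totals, first, last):
    """Sum the observations whose year falls within [first, last]."""
    return sum(n for year, n in totals.items() if first <= year <= last)

n_kept = count_in_period(totals_per_year, FIRST_YEAR, LAST_YEAR)
print(n_kept)  # 1275
```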

Figure 1: Map of field observations

Indicators from vector datasets

Identification of the crop field in RPG

Table 2: Number of observations in fields from RPG
Nobs in_RPG Perc
BACCHUS 234 189 80.77
BIOMHE 40 39 97.50
BISCO 27 26 96.30
DIVAG 40 40 100.00
DURUM_MIX_GM 2 0 0.00
FRAMEwork_BVD 36 30 83.33
LepiBats 50 30 60.00
MUESLI 60 31 51.67
OSCAR 341 312 91.50
SEBIOPAG_BVD 60 51 85.00
SEBIOPAG_Plaine de Dijon 160 160 100.00
SEBIOPAG_VcG 136 48 35.29
SEBIOPAG_ZAAr 80 80 100.00
SERIPAGE 9 9 100.00

In total, 82% of the field observations are covered by RPG data. Coverage varies widely among projects: SEBIOPAG_VcG (35%), MUESLI (52%) and LepiBats (60%) have a coverage of 60% or less. The project DURUM_MIX_GM has a single set of coordinates, which points to the entrance of the Institut Agro-Montpellier, so none of its observations fall within an RPG field.
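As a cross-check of the figures quoted above, the overall coverage and the list of low-coverage projects can be recomputed from Table 2. This is a minimal sketch with hypothetical names; the 60% cut-off is simply the value mentioned in the text.

```python
# (Nobs, in_RPG) per project, copied from Table 2.
table2 = {
    "BACCHUS": (234, 189), "BIOMHE": (40, 39), "BISCO": (27, 26),
    "DIVAG": (40, 40), "DURUM_MIX_GM": (2, 0), "FRAMEwork_BVD": (36, 30),
    "LepiBats": (50, 30), "MUESLI": (60, 31), "OSCAR": (341, 312),
    "SEBIOPAG_BVD": (60, 51), "SEBIOPAG_Plaine de Dijon": (160, 160),
    "SEBIOPAG_VcG": (136, 48), "SEBIOPAG_ZAAr": (80, 80), "SERIPAGE": (9, 9),
}

def coverage(n_obs, n_in_rpg):
    """Percentage of observations located inside an RPG field."""
    return 100.0 * n_in_rpg / n_obs

total_obs = sum(n for n, _ in table2.values())
total_in = sum(k for _, k in table2.values())
overall = coverage(total_obs, total_in)

# Projects with a coverage of 60% or less.
low = sorted(p for p, (n, k) in table2.items() if coverage(n, k) <= 60.0)

print(round(overall))        # 82
print(total_obs - total_in)  # 230 observations without a matching field
print(low)
```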

Figure 2: Case of SEBIOPAG_VcG observations in 2023 with overlaid RPG

To be discussed:

Some coordinates were taken at the edge or on the boundary of a field, so the field cannot be identified unambiguously. In such cases, should we assign the closest field within a distance threshold (e.g. 10 m)?
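To make the question concrete, here is a minimal sketch of a "closest field within a threshold" rule. It assumes coordinates in a projected system in metres and fields stored as simple polygons without holes; the function names, the toy fields and the 10 m default are illustrative assumptions, not the method used in this report.

```python
import math

def seg_dist(p, a, b):
    """Distance from point p to segment ab (coordinates in metres)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    len2 = dx * dx + dy * dy
    t = 0.0 if len2 == 0 else ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / len2
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))

def inside(p, ring):
    """Ray-casting point-in-polygon test (ring: unclosed list of vertices)."""
    hit = False
    for i in range(len(ring)):
        (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % len(ring)]
        if (y1 > p[1]) != (y2 > p[1]) and p[0] < x1 + (p[1] - y1) * (x2 - x1) / (y2 - y1):
            hit = not hit
    return hit

def dist_to_field(p, ring):
    """0 inside the field, otherwise distance to the nearest boundary segment."""
    if inside(p, ring):
        return 0.0
    return min(seg_dist(p, ring[i], ring[(i + 1) % len(ring)]) for i in range(len(ring)))

def assign_field(p, fields, max_dist=10.0):
    """Return (field_id, distance) of the closest field, or (None, None)
    when no field lies within max_dist metres of the observation."""
    best_id, best_d = None, None
    for fid, ring in fields.items():
        d = dist_to_field(p, ring)
        if best_d is None or d < best_d:
            best_id, best_d = fid, d
    if best_d is None or best_d > max_dist:
        return None, None
    return best_id, best_d

# Two hypothetical 1 ha square fields separated by a 20 m strip.
fields = {
    "A": [(0, 0), (100, 0), (100, 100), (0, 100)],
    "B": [(120, 0), (220, 0), (220, 100), (120, 100)],
}
print(assign_field((50, 50), fields))    # inside field A
print(assign_field((104, 50), fields))   # 4 m outside A: still assigned to A
print(assign_field((110, 150), fields))  # more than 10 m from any field
```

With such a rule, the distance returned alongside the field identifier could be kept as a quality flag, so that observations matched only through the threshold remain identifiable.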

Field size

Table 3: Summary statistics per project of the area (in ha) of crop fields
BACCHUS BIOMHE BISCO DIVAG FRAMEwork_BVD LepiBats MUESLI OSCAR SEBIOPAG_BVD SEBIOPAG_Plaine de Dijon SEBIOPAG_VcG SEBIOPAG_ZAAr SERIPAGE
Min. 0.26 0.68 0.50 0.97 0.36 1.40 0.41 0.23 0.36 0.53 0.19 1.21 1.55
1st Qu. 0.98 1.59 1.10 2.21 0.56 2.77 1.91 0.48 0.83 5.11 1.82 2.96 2.19
Median 2.38 3.69 1.71 2.98 1.36 6.18 4.10 1.10 3.68 6.83 4.16 4.31 3.51
Mean 4.52 4.48 3.22 3.17 3.83 11.89 5.64 1.71 5.70 7.41 6.40 4.83 4.04
3rd Qu. 6.16 6.29 3.15 4.02 5.01 13.96 7.61 1.96 5.20 8.78 9.90 6.09 5.19
Max. 39.27 13.99 23.69 6.01 18.15 55.47 34.05 15.60 29.02 17.82 18.35 16.22 7.60
Table 4: Summary statistics per project of the perimeter (in m) of crop fields
BACCHUS BIOMHE BISCO DIVAG FRAMEwork_BVD LepiBats MUESLI OSCAR SEBIOPAG_BVD SEBIOPAG_Plaine de Dijon SEBIOPAG_VcG SEBIOPAG_ZAAr SERIPAGE
Min. 0.26 0.68 0.50 0.97 0.36 1.40 0.41 0.23 0.36 0.53 0.19 1.21 1.55
1st Qu. 0.98 1.59 1.10 2.21 0.56 2.77 1.91 0.48 0.83 5.11 1.82 2.96 2.19
Median 2.38 3.69 1.71 2.98 1.36 6.18 4.10 1.10 3.68 6.83 4.16 4.31 3.51
Mean 4.52 4.48 3.22 3.17 3.83 11.89 5.64 1.71 5.70 7.41 6.40 4.83 4.04
3rd Qu. 6.16 6.29 3.15 4.02 5.01 13.96 7.61 1.96 5.20 8.78 9.90 6.09 5.19
Max. 39.27 13.99 23.69 6.01 18.15 55.47 34.05 15.60 29.02 17.82 18.35 16.22 7.60
Figure 3: Relation between area and perimeter

There is a strong relation between area and perimeter (Figure 3). The median field size is 2.8 ha and the median field perimeter is 790 m.
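For reference, both quantities can be derived from the vertices of a field polygon. The sketch below uses the shoelace formula for the area and sums the edge lengths for the perimeter; it assumes projected coordinates in metres and a simple polygon without interior holes (real RPG polygons may have holes, which a GIS library would handle). The rectangle is a hypothetical example.

```python
import math

def area_perimeter(ring):
    """Return (area in ha, perimeter in m) of a simple polygon.

    ring: unclosed list of (x, y) vertices in a projected CRS (metres).
    """
    twice_area = 0.0
    perimeter = 0.0
    for i in range(len(ring)):
        (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % len(ring)]
        twice_area += x1 * y2 - x2 * y1           # shoelace term
        perimeter += math.hypot(x2 - x1, y2 - y1)  # edge length
    return abs(twice_area) / 2.0 / 10_000.0, perimeter

# Hypothetical rectangular field of 200 m x 140 m.
rect = [(0, 0), (200, 0), (200, 140), (0, 140)]
area_ha, perim_m = area_perimeter(rect)
print(area_ha, perim_m)  # 2.8 ha and 680 m
```

A compact rectangle of 2.8 ha has a perimeter of about 680 m, which illustrates how elongated or irregular shapes push the perimeter above what the area alone would suggest.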

Outliers

Figure 4: Abnormally large perimeter
Figure 5: Abnormally small field

To be discussed:
Some RPG polygons are classified as Bordure de champ; these are field borders rather than fields (as in Figure 5). Should we remove the RPG polygons that are not agricultural fields before running the calculations?

Hedgerows length

Using the field as defined by RPG, we can calculate the length of hedgerows from BD Haies that intersect the field (+ a small buffer).

Table 5: Summary statistics of the hedgerow length (in m) for different buffer sizes around the field
B_0m B_5m B_10m
Min. 0.00 0.00 0.00
1st Qu. 0.00 0.00 0.00
Median 0.00 30.97 68.36
Mean 82.00 167.59 213.70
3rd Qu. 51.00 201.30 288.28
Max. 2935.23 4336.95 5105.47
NA’s 230.00 230.00 230.00
PercWithHedges 42.11 60.00 70.14

The 230 NA’s correspond to the observations for which no corresponding field was found in RPG. Without a buffer, 42% of the fields have hedgerows within the field. This percentage rises to 70% with a 10 m buffer around the field.
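The effect of the buffer size can be illustrated with a small, dependency-free approximation. The hedgerow is cut into short pieces, and a piece is counted when its midpoint is inside the field or within the buffer distance of its boundary. This is a sketch only: an exact computation would clip the lines against the buffered polygon with a GIS library, and the function names, the 0.5 m step and the toy geometry are assumptions.

```python
import math

def seg_dist(p, a, b):
    """Distance from point p to segment ab (coordinates in metres)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    len2 = dx * dx + dy * dy
    t = 0.0 if len2 == 0 else ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / len2
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))

def inside(p, ring):
    """Ray-casting point-in-polygon test (ring: unclosed list of vertices)."""
    hit = False
    for i in range(len(ring)):
        (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % len(ring)]
        if (y1 > p[1]) != (y2 > p[1]) and p[0] < x1 + (p[1] - y1) * (x2 - x1) / (y2 - y1):
            hit = not hit
    return hit

def dist_to_field(p, ring):
    """0 inside the field, otherwise distance to the nearest boundary segment."""
    if inside(p, ring):
        return 0.0
    return min(seg_dist(p, ring[i], ring[(i + 1) % len(ring)]) for i in range(len(ring)))

def hedgerow_length(line, ring, buffer_m=0.0, step=0.5):
    """Approximate length (m) of polyline `line` within `buffer_m` of the field."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(line, line[1:]):
        seg_len = math.hypot(x2 - x1, y2 - y1)
        n = max(1, math.ceil(seg_len / step))  # number of pieces on this segment
        for k in range(n):
            t = (k + 0.5) / n                  # midpoint of piece k
            mid = (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
            if dist_to_field(mid, ring) <= buffer_m:
                total += seg_len / n
    return total

# Hypothetical 1 ha square field and a 100 m hedgerow starting at its centre.
field = [(0, 0), (100, 0), (100, 100), (0, 100)]
hedge = [(50, 50), (150, 50)]
for b in (0, 5, 10):
    print(b, hedgerow_length(hedge, field, buffer_m=b))  # 50, 55 and 60 m
```

In this toy case each additional 5 m of buffer adds 5 m of hedgerow, because the hedgerow leaves the field perpendicular to its edge; a hedgerow running parallel to the boundary just outside the field would instead be counted entirely or not at all.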

Figure 6: Correlation among hedgerows lengths per buffer size

Outliers

Figure 7: Field with hedgerows at 5m buffer
Figure 8: Field with hedgerows at 5-10m

To be discussed:

  • Which buffer size should we use to calculate the hedgerow lengths? Using no buffer might be too restrictive, but is 10 m too large, or not large enough?
  • Should we take the position of the sampling point within the field into account when calculating the hedgerow length?

Field size within buffer

Table 6: Summary statistics of the mean field area (in ha) for different buffer sizes
B_500m B_1000m B_1500m
Min. 0.22 0.26 0.30
1st Qu. 1.50 1.57 1.55
Median 2.44 2.33 2.30
Mean 3.00 2.63 2.51
3rd Qu. 3.69 3.15 2.93
Max. 22.25 11.82 12.61
NA’s 13.00 7.00 5.00

Some observations have no crop field within 500 m (N=13). It remains to be checked whether those observations (listed in Table 7) were really made close to an agricultural field.
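The NA values arise naturally from the definition of the indicator: when no field is found within the buffer, there is no mean to compute. The sketch below illustrates this with one possible selection rule, namely keeping the fields whose centroid lies within the buffer radius. That rule is an assumption made for the example (fields could also be selected by intersection, or clipped to the buffer), and all names and values are hypothetical.

```python
import math

def mean_field_area(obs_xy, fields, radius_m):
    """Mean area (ha) of the fields whose centroid lies within radius_m of obs_xy.

    fields: list of (centroid_x, centroid_y, area_ha) in projected metres.
    Returns None (reported as NA) when no field is found in the buffer.
    """
    ox, oy = obs_xy
    areas = [a for (cx, cy, a) in fields if math.hypot(cx - ox, cy - oy) <= radius_m]
    return sum(areas) / len(areas) if areas else None

# Hypothetical fields around an observation located at the origin.
fields = [(100, 0, 2.0), (300, 300, 4.0), (1200, 0, 9.0)]
for radius in (500, 1000, 1500):
    print(radius, mean_field_area((0, 0), fields, radius))  # 3.0, 3.0, 5.0

# An observation far from any field yields None, i.e. an NA in the table.
print(mean_field_area((10_000, 10_000), fields, 500))
```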

Table 7: Observations with no fields within a 500m buffer.
Study_ID Site Year
154 DURUM_MIX_GM DIASCOPE 2017
232 DURUM_MIX_GM DIASCOPE 2018
704 LepiBats C01 2021
705 LepiBats C02 2021
706 LepiBats C03 2021
707 LepiBats C04 2021
708 LepiBats C05 2021
709 LepiBats C06 2021
710 LepiBats C07 2021
712 LepiBats C09 2021
713 LepiBats C10 2021
242 OSCAR 33_2011_00002 2018
1133 OSCAR 11_2023_00004 2023
Figure 9: Correlation among field areas per buffer size

Outliers

Figure 10: Highest average field size within 1500m buffer
Figure 11: Lowest average field size within 1500m buffer
Figure 12: Large differences between 1000 and 1500m buffer
Figure 13: Large differences between 500 and 1000m buffer

Summary and questions about vector indicators:

  • Most observations fall within a field of the RPG dataset (Table 2).
  • However, some coordinates were taken at the very edge of fields (Figure 2), so we might need to identify the closest field instead (with a distance threshold, e.g. 10 m).
  • Adding the RPG complété requires more data processing, and in any case it will not cover all observations (although it will add some vineyards). The RPG classes might also be less consistent within our timeframe and would require further checks.
  • We might need to exclude some RPG classes (e.g. Bordure, Bande tampon, Culture sous serre, Bois paturés, Surface non agricole, Truffière) to include only the crop fields that are relevant for us.
  • The position of the observations within the field might influence the results (influence of hedgerows, or of agricultural practices). We might want to add an indicator reflecting the distance to the center of the field and/or the distance to the closest field boundary?
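The exclusion of non-crop classes mentioned above could be implemented as a simple filter on the RPG label. The sketch below uses the class names exactly as listed in this summary and matches them by prefix; the exact spelling and codes in RPG would need to be checked, and the record structure (`id`, `label`) is hypothetical.

```python
# Class names as written in the summary above (assumed spelling, to be
# checked against the actual RPG nomenclature).
EXCLUDED_PREFIXES = (
    "bordure", "bande tampon", "culture sous serre",
    "bois paturés", "surface non agricole", "truffière",
)

def is_crop_field(label):
    """True when the RPG label does not start with an excluded class name."""
    return not label.strip().lower().startswith(EXCLUDED_PREFIXES)

# Hypothetical RPG records.
fields = [
    {"id": 1, "label": "Blé tendre d'hiver"},
    {"id": 2, "label": "Bordure de champ"},
    {"id": 3, "label": "Vigne (sauf vigne rouge)"},
    {"id": 4, "label": "Surface non agricole"},
]
kept = [f["id"] for f in fields if is_crop_field(f["label"])]
print(kept)  # [1, 3]
```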

Indicators from raster datasets (RPG+OSO)

Crop rotation (N-5:N)

Table 8: Land cover data sources for the observations at year N to N-5
inRPG inOSO NAs
lulc_N 1042 233 283
lulc_N-1 1100 214 244
lulc_N-2 1028 221 309
lulc_N-3 923 213 422
lulc_N-4 779 209 570
lulc_N-5 632 181 745
Table 9: Most common land cover classes
landcover class N
RPG_Vigne (sauf vigne rouge) 484
RPG_Blé tendre d’hiver 165
RPG_Autre verger (y compris verger DOM) 79
OSO_Vignes 64
OSO_Forêts de feuillus 41
OSO_Prairies 41
RPG_Maïs (hors maïs doux) 34
RPG_Orge d’hiver 29
RPG_Maïs ensilage 26
RPG_Mélange de céréales ou pseudo-céréales d’hiver entre elles 25
RPG_Vigne : raisins de cuve non en production 24
RPG_Colza d’hiver 21
Figure 14: Land cover at year N
Figure 15: Crop rotation in the period N:N-5

There are 648 observations with a complete time series from year N-5 to year N. Of these observations with complete rotation information, 342 have the same crop group for the whole period, while 74 have four different crop groups over the six years.
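The rotation indicator summarised here amounts to counting the distinct crop groups over the six years and discarding incomplete series. A minimal sketch, with hypothetical crop group names and function names:

```python
from collections import Counter

def n_crop_groups(series):
    """Number of distinct crop groups over years N-5..N.

    series: list of 6 crop groups, with None when the land cover is unknown.
    Returns None when the series is incomplete.
    """
    if len(series) != 6 or any(g is None for g in series):
        return None
    return len(set(series))

# Hypothetical time series, ordered from N-5 to N.
observations = {
    "obs_1": ["vineyard"] * 6,
    "obs_2": ["wheat", "maize", "wheat", "rapeseed", "wheat", "barley"],
    "obs_3": ["wheat", None, "wheat", "maize", "wheat", "maize"],
}
counts = {obs: n_crop_groups(s) for obs, s in observations.items()}
print(counts)

# Distribution of the indicator among complete series only.
print(Counter(n for n in counts.values() if n is not None))
```

A permanent crop such as a vineyard gives a value of 1, which is consistent with the large number of single-group observations in projects dominated by vineyards.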

Table 10: Number of crop groups in the period N:N-5 for observations with complete data
BACCHUS FRAMEwork_BVD LepiBats OSCAR SEBIOPAG_BVD SEBIOPAG_Plaine de Dijon SEBIOPAG_VcG SEBIOPAG_ZAAr
1 116 28 39 119 35 3 2 0
2 0 7 8 71 4 6 9 19
3 0 1 3 46 1 16 17 12
4 0 0 0 18 0 29 19 8
5 0 0 0 1 0 6 4 1

Land cover within buffer

Table 11: Summary of the land cover buffer composition
buffer_500 buffer_1000 buffer_1500
n_classes 186 220 237
av_perc_rpg 50 47 45
Figure 16: Average land cover per buffer size
Figure 17: Average land cover per dataset with buffer of 500m
Figure 18: Average land cover per dataset with buffer of 1000m
Figure 19: Average land cover per dataset with buffer of 1500m
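The buffer composition can be thought of as a count of raster cells per class around the observation. The sketch below works on a small in-memory grid and selects the cells whose centre lies within a radius expressed in pixels (with 10 m pixels, a 500 m buffer corresponds to 50 pixels). The grid, the class names and the reading of av_perc_rpg as the share of cells carrying an RPG class are assumptions made for illustration.

```python
from collections import Counter

def landcover_percent(grid, row, col, radius_px):
    """Percentage of each class among the cells whose centre lies within
    radius_px (an integer number of pixels) of the cell (row, col)."""
    counts = Counter()
    r2 = radius_px * radius_px
    for i in range(max(0, row - radius_px), min(len(grid), row + radius_px + 1)):
        for j in range(max(0, col - radius_px), min(len(grid[i]), col + radius_px + 1)):
            if (i - row) ** 2 + (j - col) ** 2 <= r2:
                counts[grid[i][j]] += 1
    total = sum(counts.values())
    return {cls: 100.0 * n / total for cls, n in counts.items()}

# Hypothetical 5 x 5 raster combining RPG and OSO classes.
grid = [
    ["RPG_vine",  "RPG_vine",  "OSO_forest", "OSO_forest", "OSO_forest"],
    ["RPG_vine",  "RPG_vine",  "RPG_vine",   "OSO_forest", "OSO_forest"],
    ["RPG_vine",  "RPG_vine",  "RPG_vine",   "OSO_grass",  "OSO_forest"],
    ["RPG_wheat", "RPG_wheat", "RPG_vine",   "OSO_grass",  "OSO_grass"],
    ["RPG_wheat", "RPG_wheat", "RPG_wheat",  "OSO_grass",  "OSO_grass"],
]
perc = landcover_percent(grid, 2, 2, 1)
perc_rpg = sum(v for cls, v in perc.items() if cls.startswith("RPG_"))
print(perc)      # 80% RPG_vine and 20% OSO_grass
print(perc_rpg)  # 80.0
```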

Density of bordures

To be defined.

Summary and questions about raster indicators:

  • There are up to 237 land cover classes in the RPG+OSO dataset. Here we simplified them using the Référentiel des cultures as an illustration. Before the extracted information can be useful in the project, further work is required to harmonize the land cover classes.
  • The edge density indicator needs further thought.